Dissertation Memo
1 Introduction
1.1 Purpose
Since defending my prospectus, I’ve developed, deployed, and refined a raft of models that carry out my express research agenda. The object of this memo is to describe these models and both illustrate and comment on my central findings.
1.2 Research Questions
Before proceeding, it may be useful to recall the contents of my research agenda. To this end, below are concise formulations of the questions animating each chapter (in order):
- Do human rights conditions in preferential trade agreements (PTAs), particularly when codified as more legalized (i.e., “harder”), improve human rights respect among signatories?
- Do bilateral investment treaties (BITs) improve human rights respect among signatories?
- Do economic sanctions sent by Western states improve human rights respect among non-Western targets?
1.3 Key Findings
As we shall see, my models (as presently construed) suggest the following general answers to my research questions:
- “Harder” human rights conditions in PTAs are associated with improved human rights respect in less-democratic states but a worsening of such respect in more-democratic ones.
- Likewise, BITs are associated with improved human rights respect in less-democratic states but a worsening of such respect in more-democratic ones.
- Conversely, economic sanctions against non-Western states are associated with worsened human rights respect in less-democratic targets but an improvement of such respect in more-democratic ones.
1.4 Overview of Memo
The remainder of my memo is organized as follows:
2 The General Methodology
2.1 Summary
The following section delineates the overarching methodology common to my work in all three chapters:
- Replication (Section 2.2)
- Preprocessing (Section 2.3)
  - Data wrangling (Section 2.3.1)
  - Multiple imputation (Section 2.3.2)
  - Spatial lagging (Section 2.3.3)
  - Start-year specification (Section 2.3.4)
  - Temporal lagging (Section 2.3.5)
- Double/debiased machine learning (Section 2.4)
- Pooling (Section 2.4)
2.2 Replication
In each chapter, I first run models that approximately replicate those of the literature inspiring my work. The main points of departure (where appropriate) are (1) the substitution of HR Scores for the authors’ outcomes and of my selected treatments for theirs, and (2) the use of both unimputed data (in line with the authors’ methodologies) and multiply imputed data. I do so to test whether introducing new key variables and/or multiple imputation immediately yields divergent results. More on these replications can be found in Section 3.3, Section 4.3, and Section 5.3.
2.3 Preprocessing
2.3.1 Data Wrangling
2.3.1.1 Treatments
Before running the replication and novel models, I assemble a dataset, \(\textbf{X}\), through data wrangling. The first (and perhaps most involved) step in this process is the construction of each treatment variable, \(D\), a task which generally demands significant transformations to the raw data to ultimately render them in country-year (i.e., panel) format. How these treatments are sourced and generated—and how missingness is handled—is discussed further in Section 3.1, Section 4.1, and Section 5.1.
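To make the idea of rendering raw data in country-year format concrete, the following is a minimal sketch only. It assumes hypothetical treaty-level records with two signatories and an entry-into-force year; the column names, the `to_country_year` helper, and the count-based treatment are all illustrative assumptions, not the construction used in the chapters (which is described in Section 3.1, Section 4.1, and Section 5.1).

```python
import pandas as pd

# Hypothetical treaty-level records: one row per agreement, two signatories each.
# All names and values here are made up for illustration.
treaties = pd.DataFrame({
    "treaty_id": [1, 2],
    "party_a": ["A", "A"],
    "party_b": ["B", "C"],
    "year_in_force": [2000, 2002],
})


def to_country_year(treaties: pd.DataFrame, start: int, end: int) -> pd.DataFrame:
    """Return a country-year panel counting the treaties in force per country."""
    # Stack the two signatory columns so each row is one (treaty, country) pair.
    long = treaties.melt(
        id_vars=["treaty_id", "year_in_force"],
        value_vars=["party_a", "party_b"],
        value_name="country",
    )[["treaty_id", "country", "year_in_force"]]

    # Build the full country-year grid so untreated country-years appear as zeros.
    years = pd.DataFrame({"year": list(range(start, end + 1))})
    countries = pd.DataFrame({"country": sorted(long["country"].unique())})
    grid = countries.merge(years, how="cross")

    # Pair every country-year with each of that country's treaties,
    # then flag the years in which the treaty is already in force.
    merged = grid.merge(long, on="country", how="left")
    merged["in_force"] = (merged["year_in_force"] <= merged["year"]).astype(int)

    # Collapse back to one row per country-year.
    return (
        merged.groupby(["country", "year"], as_index=False)["in_force"]
        .sum()
        .rename(columns={"in_force": "n_treaties"})
    )


panel = to_country_year(treaties, 1999, 2003)
```

Building the full grid before aggregating is a deliberate choice in this sketch: it keeps country-years with no treaty in force as explicit zeros rather than letting them drop out of the panel.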
2.3.1.2 Covariates
Afterwards, I consolidate the outcome, \(Y\), and covariates, \(X = (X_{1}, \ldots, X_{n})\). I also create additional covariates where necessary. \(X\) is nearly identical across the three chapters, featuring the following common elements:
- All V-Dem high-level and mid-level indices.1
- From the Polity Project, Polity Score and Regime Durability.2
- World Development Indicators (World Bank) pertaining to:3
- From Fariss et al. (2022), latent estimates of GDP, GDPpc, and population (logged).
- Balance of payments.6
1 \(n = 26\)
2 The proximate sources of these variables are V-Dem and the Quality of Governance (QoG) dataset, respectively.
3 These variables are sourced from the QoG.
4 Sum of exports and imports of goods and services, % of country-year GDP.
5 Net inflows and outflows, % of country-year GDP.
6 % of country-year GDP.
Justifications for the inclusion of these covariates, as well as exceptions to the common elements of \(X\), are present in Section 3.2, Section 4.2, and Section 5.2.
2.3.1.3 Case Selection & Coding
During the assembly of \(\textbf{X}\), I make several decisions about how to handle observations with irregular coding attributes or with missingness in numerous or critical areas. Most of these cases involve (1) small-nation status, (2) partial international recognition, and (3) post-Cold War transitions.
First, I omit from \(\textbf{X}\) all countries that fail to appear in V-Dem, coverage in HR Scores notwithstanding. Microstates, as classified by the Correlates of War (COW) project, constitute the lion’s share of these cases. Though this omission comes at the cost of circumscribing my studies’ external validity (namely, to non-microstates), it likely improves the robustness of my final inferences vis-à-vis future replications. Indeed, V-Dem indices constitute the majority of dimensions in each chapter’s \(X\), meaning conclusions drawn from imputed V-Dem data for microstates would be highly sensitive to the randomness inherent in the imputation process.7
7 More on my missingness handling method—multiple imputation—may be found in Section 2.3.2.
8 A discussion of my spatial models is to be found in Section 2.3.3.
9 For a discussion of Palestine’s exclusion, see the list’s attendant case description.
10 Only states with accompanying Gleditsch and Ward IDs (i.e., “state numbers”) appear in Fariss et al.’s (2022) data.
11 The computation of these treatments is outlined in Section 3.1 and Section 4.1.
Moreover, I omit Palestine in view of problems originating from inconsistent coding patterns, themselves a general consequence of its incomplete international recognition. Unlike the foregoing microstates, Palestine is covered in V-Dem. However, its reported scores are disaggregated for much of the post-World War II period, with Gaza and the West Bank treated as mutually exclusive observations. It is therefore unclear how to reconcile these data with the single spatial unit that the source of my spatial data, Natural Earth, supplies for Palestine.8 Equally important, Palestine is absent from Gleditsch & Ward’s List of Independent States (v7) on the basis of its contested international recognition9 and therefore lacks the prerequisite for coverage in Fariss et al.’s (2022) GDP, GDPpc, and population variables,10 estimates essential for computing several treatments pre-imputation.11 Owing to these theoretical and logistical roadblocks, I abstain from incorporating Palestine into my models, with the understanding that this exclusion comes at some inferential cost.
Some of my models, on the other hand, include two extinct countries: East Germany (formally the German Democratic Republic) and South Yemen (formally the People’s Democratic Republic of Yemen). Each was a communist state that unified with its non-communist counterpart (West Germany and North Yemen, respectively) in 1990, amidst the denouement of the Cold War. The two countries enjoy complete data for HR Scores, Fariss et al.’s (2022) estimates, and virtually all V-Dem covariates, inter alia, yet they crucially lack shapefiles from Natural Earth, which does not offer data on historical geographical boundaries. On account of data availability, I opt to integrate East Germany and South Yemen into my non-spatial models, which means these models contain more cases than my spatial ones. Always marginal (\(\Delta = 2\)), this difference is only present in models with a pre-1990 start year, before the two countries naturally “drop out” of the dataset.
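The relationship between start year and the case gap can be illustrated with a toy example. The sketch below is hypothetical throughout: the miniature panel, the `has_shape` set standing in for countries with boundary polygons, and the helper names are assumptions made for illustration, and the toy data simply assume that the two extinct countries are last observed in 1989.

```python
import pandas as pd

# Toy country-year panel; values are invented for illustration only.
panel = pd.DataFrame({
    "country": ["East Germany", "East Germany", "South Yemen",
                "France", "France", "France"],
    "year": [1988, 1989, 1989, 1988, 1989, 1991],
})

# Hypothetical set of countries for which boundary polygons are available.
has_shape = {"France"}


def n_countries(panel: pd.DataFrame, start_year: int, spatial: bool) -> int:
    """Count distinct countries observed from start_year onwards."""
    sub = panel[panel["year"] >= start_year]
    if spatial:
        # Spatial models can only use countries with a boundary polygon.
        sub = sub[sub["country"].isin(has_shape)]
    return sub["country"].nunique()


def case_gap(start_year: int) -> int:
    """Difference in country counts between non-spatial and spatial models."""
    return n_countries(panel, start_year, False) - n_countries(panel, start_year, True)


gap_pre_1990 = case_gap(1988)   # both extinct countries still observed
gap_post_1990 = case_gap(1990)  # both have dropped out of the panel
```

In this toy setting the gap is two for the 1988 start year and zero for the 1990 start year, mirroring the pattern described above.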
A further consideration resulting from post-Cold War geopolitical reconfigurations is the existence of several discrepancies in case identification, namely, differences across datasets in the coding of continuities between predecessor and successor states. These present challenges for variable collation and subsequent inference: when joining datasets on country identifiers (e.g., COW codes), the discrepancies may yield spurious duplicates or missingness, raising doubts over inferential validity in later modelling stages. To harmonize all inputs of \(\textbf{X}\), I allow decisions adopted by HR Scores and/or V-Dem (the sources of my outcome and the preponderance of covariates) to override alternative coding schemes in the following cases:
- Czechia: coded as the successor of Czechoslovakia.
- Germany: coded as the successor of West Germany.
- Serbia: coded as the successor of Yugoslavia.
- Yemen: coded as the successor of North Yemen.
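The effect of such overrides on a join can be sketched as follows. This is an illustrative example under stated assumptions, not the actual harmonization code: the identifiers are plain placeholder names rather than real COW, V-Dem, or HR Scores codes, and `SUCCESSOR_MAP`, `harmonize`, and the two small data frames are hypothetical.

```python
import pandas as pd

# Placeholder identifiers mapping each predecessor to its coded successor.
# These are NOT actual COW, V-Dem, or HR Scores codes.
SUCCESSOR_MAP = {
    "Czechoslovakia": "Czechia",
    "West Germany": "Germany",
    "Yugoslavia": "Serbia",
    "North Yemen": "Yemen",
}


def harmonize(df: pd.DataFrame, id_col: str = "country") -> pd.DataFrame:
    """Recode predecessor identifiers to their successor before joining."""
    out = df.copy()
    out[id_col] = out[id_col].replace(SUCCESSOR_MAP)
    return out


# One source already treats Germany as continuous; the other does not.
outcome = pd.DataFrame({
    "country": ["Germany", "Germany"],
    "year": [1989, 1991],
    "y": [0.5, 0.7],
})
covars = pd.DataFrame({
    "country": ["West Germany", "Germany"],
    "year": [1989, 1991],
    "x": [1.0, 2.0],
})

# A naive join leaves the 1989 covariate missing because the identifiers differ.
naive = outcome.merge(covars, on=["country", "year"], how="left")

# Harmonizing both inputs first recovers the match; validate raises an error
# if the recoding were to create duplicate country-years.
fixed = harmonize(outcome).merge(
    harmonize(covars), on=["country", "year"], how="left", validate="one_to_one"
)
```

Passing `validate="one_to_one"` to the merge is one way to catch the duplicate problem early, since pandas raises an error instead of silently multiplying rows.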
Having implemented these decisions, I arrive at my final set of cases. In general, the maximum number of countries featuring in the two-way fixed effects models is 176, whereas the equivalent for the spatial models is 174. For a full breakdown of these figures, as well as complete itemizations of the excluded and manually coded cases discussed above, see the appendices (Section 7.1, Section 7.2, and Section 7.3).
2.3.2 Multiple Imputation
In the next step, I rectify remaining missingness in \(X\) through the method of multiple imputation.